NOTES . . .



from the Editor 



What constitutes a research study? This question 
becomes crucial in selecting articles and documents to 
be abstracted for Investigations in Mathematics Education . 
We believe that readers of this journal will be best served 
by a broad interpretation of that issue rather than a 
narrow definition. We have abstracted evaluative studies 
of large curriculum projects or teacher-training programs 
even though these projects and programs did not show 
evidence of the careful control of variables usually 
associated with research studies. Evaluations of math- 
ematics education programs call for practical applications 
of many research techniques in settings not particularly 
amenable to classic research design. Analyses of studies 
where research techniques have been applied to evaluate 
existing programs can provide important insight into 
these special problems. 

The first abstract in this issue discusses a study 
which looks carefully at a particular research technique 
(item-sampling) as applied to formative curriculum
evaluation. As our abstracter points out, this particular
study is not experimental. But it does provide guidelines
for the use of the item-sampling technique in other
formative evaluation studies. The application of this 
particular technique is important to all researchers. 

We believe it is a good example of the advantage to be 
gained by adopting a broad view of research when selecting 
articles for this journal. 

We appreciate comments from our readers concerning 
the coverage of articles abstracted in this journal. 

Readers who would like to see specific reports or documents 
abstracted by Investigations in Mathematics Education 
are encouraged to write the editor. 



Jon L. Higgins 
Editor

Expanded Abstract and Analysis Prepared Especially for
I.M.E. by Arthur F. Coxford, University of Michigan. 



1. Purpose 

To investigate the application of the technique of 
item-sampling to formative curriculum evaluation in 
mathematics . 



2. Rationale

In the item-sampling technique, a set of n items is 
randomly partitioned into r subsets. The r subsets of 
items are randomly assigned to s subjects so that each 
subject responds to only a subset of the items. Theoreti- 
cally the descriptive statistics obtained for an item, a 
subset of items or the entire set of n items are estimates 
of the respective population descriptive statistics. There 
is evidence which suggests that means obtained by item 
sampling techniques may be significantly greater than 
means obtained by conventional procedures. Even so, when 
subjects are exposed only to item sampling techniques, 
the conditions which influence performance are assumed 
uniform and inflated means are much less important. 
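
The strict form of the technique described above can be sketched in Python. This is a minimal illustration only: the function names, the `seed` parameter, the round-robin dealing of shuffled items, and the example sizes are assumptions made for the sketch, not details reported in the study.

```python
import random


def item_sample(n_items, r_subsets, s_subjects, seed=0):
    """Randomly partition items 0..n_items-1 into r subsets, then
    randomly assign one subset to each of s subjects."""
    rng = random.Random(seed)
    items = list(range(n_items))
    rng.shuffle(items)
    # Deal the shuffled items round-robin so subset sizes differ by at most one.
    subsets = [items[k::r_subsets] for k in range(r_subsets)]
    # Shuffle the subjects, then cycle through the subsets so that each
    # subset is used about equally often.
    subjects = list(range(s_subjects))
    rng.shuffle(subjects)
    assignment = {subj: idx % r_subsets for idx, subj in enumerate(subjects)}
    return subsets, assignment


def estimate_item_difficulty(responses):
    """responses: iterable of (item, correct) pairs pooled over subjects.
    Returns {item: proportion of correct responses}."""
    totals, rights = {}, {}
    for item, correct in responses:
        totals[item] = totals.get(item, 0) + 1
        rights[item] = rights.get(item, 0) + (1 if correct else 0)
    return {item: rights[item] / totals[item] for item in totals}


# Hypothetical sizes: 240 items, 12 subsets, one class of 30 subjects.
subsets, assignment = item_sample(n_items=240, r_subsets=12, s_subjects=30)
```

Each subject answers only one subset, yet pooling the responses gives a proportion-correct estimate for every item in the pool.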

In mathematics curriculum development a great number 
of objectives are sought over a year. In formative evaluation
of a mathematics program, information on all these
objectives is desirable at several times during the year 
so that the curriculum developer may identify weaknesses 
and institute correctional procedures. These needs cannot 
be met by conventional testing procedures. They may be 
satisfied by the item-sampling technique because all ob- 
jectives can be measured several times during the year 
without having every subject respond to every test item.

In applying an item-sampling technique to formative
curriculum evaluation in mathematics several interrelated 
questions need to be answered. 

1) What sample size is needed? 

2) What accuracy of the sampling estimates is 
desirable and obtainable? 

3) How many items need to be used? 

4) How many items per test are desirable and feasible? 



3. Research Design and Procedure 

The mathematics program which was formatively evaluated
was the sixth grade portion of Patterns in Arithmetic
(PIA), which is made up of 64 fifteen-minute TV sessions
with pre- and post-activities, Teacher Notes and Pupil
Exercises. Two TV lessons were viewed each week by the
students in 62 participating classrooms within a fifty
mile radius of Madison, Wisconsin.

Twelve 20-item tests were developed for use in the
evaluation. Testing was done four times during the 1968-69
school year. The first administration (T1) followed
program 5 (September) of PIA, T2 followed program 20
(November), T3 followed program 41 (February), and T4
followed program 63 (May). At each administration every
student completed one test and all twelve tests were used;
each student completed a different test at each administration.

Each evaluative instrument contained twenty items:
thirteen multiple choice items and seven free response
(work out answer and record it) items. Each test
had the same directions for administration. In a strict
item-sampling situation, items are randomly assigned to
tests. In the present study a pool of items was constructed
in June 1968 and partitioned into homogeneous (by
content area) subsets, and each test was constructed by
selecting items from a variety of content areas and a
variety of difficulty levels. The aim was an interesting,
informative, and balanced test. Thirty-five minutes was
hypothesized to be sufficient for test completion by almost
all students.

The first administration of the twelve tests was pre- 
scribed by the researcher. Participating teachers were 
required to follow directions carefully so that each test 
was taken by two or three students in each class. Upon 
receipt of the student rosters along with indication of 
the test completed by each student, the investigator assigned
tests to be completed by each student at T2, T3,
and T4. The assignment was random within the restriction
that no student should take the same test twice. 
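
The restricted random assignment just described can be sketched as follows. This is an illustration under stated assumptions: the function name, the numbering of tests from 1 to 12, and the `seed` parameter are hypothetical, and the sketch does not try to balance how often each test is used within a class, which the report does not describe.

```python
import random


def assign_later_tests(first_tests, n_tests=12, n_later=3, seed=0):
    """first_tests: {student: number of the test taken at T1}.
    Returns {student: [test at T1, T2, T3, T4]} with no test repeated
    for any student."""
    rng = random.Random(seed)
    schedule = {}
    for student, first in first_tests.items():
        # Exclude the test already taken, then draw the later tests
        # without replacement so that no test can recur.
        remaining = [t for t in range(1, n_tests + 1) if t != first]
        schedule[student] = [first] + rng.sample(remaining, n_later)
    return schedule


schedule = assign_later_tests({"student_a": 1, "student_b": 5})
```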

In order to provide a basis for making judgments concerning
the effectiveness of PIA, each item measuring a
program objective was classified into one of 5 categories:

(1) Mastery level 1: item easy for most students at the
end of the year;

(2) Mastery level 2: area received strong emphasis during
the year, yet high level mastery is not expected;

(3) Mastery level 3: items represent more complicated
aspects of content in PIA as well as problems which are
conceptually difficult and computationally complicated;

(4) Transfer level 1: items involve minor extensions of
concepts; and

(5) Transfer level 2: items are usually conceptually
difficult, represent extension of program content and
contain difficult computations.

A lower bound of acceptable end-of-year performance was
arbitrarily set for each of these categories.



4. Findings 

1) The average class time needed to complete the tests
at T1, T2, T3 and T4 was 23.8 min., 24.8 min., 23.9
min., and 23.8 min., respectively.

2) Four items appeared on two different tests. Analy- 
sis of the response rates for these four items demon- 
strated that 120 random responses produced a reasonably 
stable estimate of item difficulty. 

3) A growth profile of correct response rates was 
constructed for each item. 

4) The 240 items administered at each testing period 
were partitioned by content area to aid interpretation. 
The content area results formed the basis for decisions 
made regarding changes in PIA. 

5) On the basis of the test results 

a) Major revisions in two TV programs of PIA were 
deemed necessary. 

b) Several areas of weakness were identified, i.e. 
measurement, problem solving, long division, 
and ratio. 

6) The item pool was weak in that it did not include
items representative of all major objectives of PIA. 
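
Finding 2 above reports that 120 random responses gave a reasonably stable estimate of item difficulty. The abstract does not say how stability was judged; one common yardstick, assumed here purely for illustration, is the binomial standard error of a proportion. The function name is hypothetical.

```python
import math


def proportion_standard_error(p, n):
    """Standard error of a proportion-correct estimate p based on n
    independent responses (simple binomial model)."""
    return math.sqrt(p * (1.0 - p) / n)


# The standard error is largest when p = 0.5, so this is the worst case
# for 120 responses.
worst_case = proportion_standard_error(0.5, 120)
```

Under this assumed model the worst-case standard error with 120 responses is a little under 0.05, that is, under five percentage points of item difficulty.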



5. Interpretations 

1) Large variations in responses to single items 
should not be considered as firm evidence that program
change is needed. Rather, a set of related items
should be examined if single items suggest the possibility
of a problem.

2) The item pool used in item-sampling must include 
only items which provide information relative to the 
objectives of the program being evaluated. 

3) Test instruments should be constructed so that 
every pupil will have the opportunity to respond to 
each item. 

4) Test items should be classified prior to their 
use in evaluation. A minimum number of categories
should be used and the final level of performance for 
each category indicated by setting a lower bound 
criterion for each category. 

5) An aid in making the testing results useful is to
group items into content areas and to indicate, via
a code, the type of instructional emphasis on each
bit of content occurring between testing periods. It
is recommended that two symbols be used: one for extensive
coverage in the program and one for significant
review.



Abstracter's Notes 



This is not a report of an experimental study. It is 
a careful discussion of the technique of item-sampling as 
applied to formative evaluation. The author has provided 
well reasoned arguments for variations from the completely 
random assignment of items to tests and tests to subjects. 
He has provided good guidelines to other curriculum de- 
velopers for the successful use of the item-sampling tech- 
nique in formative evaluation. 



Arthur F. Coxford 
University of Michigan

Expanded Abstract and Analysis Prepared Especially for I.M.E.
by Jerry P. Becker, Staff Associate (Mathematics), National
(on leave during 1971-72 from Rutgers University, New
Brunswick, N.J.)



1. Purpose 

To investigate the ability of young children to compre- 
hend the relationships that exist between a solid and its 
representations expressed in the form of sketches or 
photographs . 

2. Rationale 



Various ideas, concepts, and topics are regarded as im- 
portant in the intellectual development of a child. An 
important question, in this regard, is: What is the best
time at which these ideas and topics can be comfortably
acquired? Also, are there factors in the perceptual 
development of some young children which enhance or in- 
hibit the growth of their ability to associate different 
size solids with representations of a given size? 

3. Research Design and Procedure 

Five solid shapes were used in this investigation: right
circular cylinder, sphere, ellipsoid, square-based rectangular
parallelepiped, and cube. These shapes were used because
the possibility exists that one of them might be mistaken, 
on the basis of only some of its defining characteristics, 
for another solid in the set. As an example, failure to 
consider height could result in selection of a rectangular 
solid as a cube, or vice-versa. To introduce an additional 
factor — that of size — four solids of each shape were used: 
one small, two medium, and one large. Solids were constructed 
so that the height of the small rectangular solid and the 
length of the base of the middle sized one were each the 
same as the length of an edge of the middle sized cube. 
Similarly, the height of the middle sized rectangular solid 
and the length of the base of the large rectangular solid 
were each the same as the length of an edge of the large 
cube.

A color photograph and a black line sketch of a middle
sized solid of each shape were used as classification 
stimuli. The sketch and photograph of each solid were 
fastened together, back to back, and hung on a string 
fastened to a container in which the solids could be 
placed. (This procedure facilitated switching from one
representation to another halfway through the test.) A
randomized order was used in presenting solids to the 
children for classification. 

Seventy-one three-year-olds and 58 four-year-olds were
tested, all individually. Children were to find the
representation which they felt depicted the solid they
were holding and then place the solid in the box with the
chosen representation. Selections were recorded, compiled,
and later analyzed in three ways: item analysis, paired
t-test, and multivariate analysis.

4. Findings

An item analysis showed that correct classification of
cubes and rectangular solids was most difficult for the
children, whereas cylinders, spheres, and ellipsoids were
relatively easy to classify.

Out of a possible 20 items, the mean score for subjects 
was 17.43, indicating that the test was easy for many of 
the children. The distribution was negatively skewed and 
exhibited a strong ceiling effect. 

The paired t-test was used to compare several partial
scores for each child, with results showing that three- 
and four-year-olds associate a solid equally well with 
its photographic or sketch representation. Children could 
more easily classify solids that were the same size as the 
solid in the sketch or photograph. Although both large and 
small size solids were difficult for children to classify, 
neither was significantly more difficult than the other. 

A multivariate analysis was carried out, with the following
factors considered: (1) Teacher and experience (whether or
not the student had formal lessons pertaining to shape),
(2) Order of presentation (sketches or photographs first),
(3) Socio-economic level, (4) IQ. Results showed that the
Teacher and the Experience of the child with an instruc- 
tional unit dealing with shape were the most influential 
factors in the child's performance. It was not possible to 
differentiate between Teacher and Experience or to ascer- 
tain which, if either, made the more significant contribu- 
tion to the child's performance. Results showed, however, 
that there is high potential effectiveness in teaching 
three- and four-year-olds to identify shapes and that the 
child's performance in the classification of any medium or 
large solids and of cubes of any size will serve as a good 
indicator of the experience (or teacher) with which the 
child was associated.

IQ also affected performance of the children. Small
solids of any shape as well as any cylinder can be used 
to discriminate between IQ groups. Medium IQ children 
(between 90 and 110) and high IQ children (above 110) 
scored significantly better than children with low IQs 
(below 90) . 

When order of presentation was significant, higher scores 
were achieved by children who first observed photographs 
for the placement of the 10 solids, and then observed 
sketches for the remaining 10 solids; but only rectangular 
solids provided a basis for discrimination between children. 

Children were classified into one of five socio-economic 
levels using the Hollingshead Two Factor Index; however, 
socioeconomic level was not a significant factor in any of 
the comparisons. 

5. Interpretations 

The study showed that three- and four-year-old children 
are able to classify solids by either pictures or sketches 
of the solids. Their ability to identify solids correctly 
appears to be influenced by a reasonable training program 
and the IQ level of the children. Further, placement of 
solids appears to be dependent on physical characteristics 
of the solids being classified and data show that the 
children had the most difficulty differentiating between 
cubes and rectangular solids. 

Abstracter's Notes

Studies such as this, in which a child's acquisition of 
mathematical ideas and relationships is studied, are always 
of interest to mathematics educators. However, a clear 
cut rationale for the study is not stated, nor does it seem 
obvious. For example, it is not clear what the results of 
such research can tell us about the intellectual development 
of a child. Nor is it clear how such results fit into 
the picture of a child's mathematical thinking development. 
And what are the implications for curriculum development? 
These are questions that might be discussed so that mathe- 
matics educators get a better understanding of how the 
research can help us and be used. 

A mean of 17.43 out of a possible 20 items seems quite 
high. Accordingly, it is not clear why an extensive 
analysis of the data is made, leading to the conclusion 
that the test was easy for many of the children. 

It was found that three- and four-year-olds associate 
solids equally well with their photographic or sketch 
representations. But what might be the characteristics of 
the photographic and sketch representations which might 
lead us to hypothesize otherwise? Also, it was found that 
the Teacher and the Experience of the child with an instructional
unit dealing with shape were the most influential
factors in a child's performance. While not being able to
differentiate between these factors, the author goes on 
to mention that a child's performance in the classifica- 
tion of any medium or large solids and of cubes of any 
size will serve as a good indicator of the experience (or 
teacher) with which the child was associated. I am not 
sure of the value of this observation nor of the degree to 
which it might be generalized. 

IQ is found to affect the performance of a child. However, 
IQ is a general kind of construct, which may have many 
components. In particular, spatial ability is a part of 
the more general concept of IQ and may have played a role 
in the performance of children on the test. For example, 
might not children with "high" spatial ability be able to 
perform consistently better than children with "lower" 
spatial ability? In general, examining the role of
particular aspects of mental ability in performance on
mathematical tasks may be more revealing than examining
the role played by more general IQ measures.

Jerry P. Becker 

Rutgers University

Georgia Univ., Athens; Purdue Univ., Lafayette, Ind.

Pub Date Feb 71, Note: 11p.; Paper presented at the
Annual Meeting of the American Educational Research
Association (Feb. 4-7, 1971, New York City, N.Y.) EDRS
Price MF-$0.65 HC-$3.29

Descriptors: *Concept Formation, *Elementary School
Mathematics, *Mathematical Vocabulary, Mathematics
Education, *Reading Research, *Secondary School
Mathematics.

Expanded Abstract Prepared Especially for I.M.E. by 
L. D. Nelson, University of Alberta. 



1. Purpose

To develop a difficulty measure of mathematical terms 
and mathematical symbols as a step in the development of 
readability formulas appropriate for mathematical materials. 



2. Rationale



The level of vocabulary difficulty in reading material 
is usually determined by comparing the words in the mater- 
ial with a list of words having a certain familiarity or 
frequency of use. Quite serious problems arise, however, 
when the material contains a large proportion of special- 
ized vocabulary such as is found in mathematics textbooks. 
This vocabulary is made up of words which may have gener- 
al meaning but are used in mathematics in specific con- 
texts (e.g. set); which may have mathematical meanings 
different from ordinary meanings (e.g. field); which may 
have meaning only in mathematics (e.g. perimeter) ; and 
the like. There is also a complex system of symbols 
(e.g. the square root sign). It was to determine a mea- 
sure of difficulty for such mathematical terms and sym- 
bols that the authors conducted this study. 



3. Research Design and Procedure 

It was proposed to obtain measures of the familiarity of
mathematical terms and mathematical symbols to seventh
and eighth grade students in the United States. To do
this the authors proceeded to develop two measuring
instruments, one for mathematical terms and one for
mathematical symbols. A student was deemed to be familiar
with a term if he could remember a definition, give an
example, or give an explanation in his own words.
The student was the judge and would check either "know"
or "don't know" for the term. The familiarity score for
a term was the percentage of students who indicated they
knew the term. A similar procedure was used to get a
familiarity score for a mathematical symbol.
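
The familiarity score defined above is a simple percentage, which can be sketched as follows. The function name and the input layout (one `(term, knows)` pair per student response) are assumptions made for illustration; the example data are invented.

```python
def familiarity_scores(responses):
    """responses: iterable of (term, knows) pairs, one per student response,
    where knows is True for a "know" check and False for "don't know".
    Returns {term: percentage of responses marked "know"}."""
    counts = {}
    for term, knows in responses:
        known, total = counts.get(term, (0, 0))
        counts[term] = (known + (1 if knows else 0), total + 1)
    return {term: 100.0 * known / total
            for term, (known, total) in counts.items()}


# Invented example: three of four students check "know" for one term.
example = [("term_x", True), ("term_x", True), ("term_x", True), ("term_x", False)]
scores = familiarity_scores(example)
```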

A sampling frame or list of mathematical terms to be 
tested (1165 terms in all) was compiled from pre-calcu- 
lus mathematics books. Pre-calculus texts were used 
because the authors were primarily interested in the 
readability for seventh and eighth graders of elementary 
and secondary materials. The phrase "additive identity"
and the words "additive" and "identity" would all be included
in the frame. If familiarity of a word which appeared
in a phrase was wanted, such as "acute" in "acute
triangle", the word whose familiarity was to be determined
was set between asterisks (*acute* triangle). From the
list of 1165 terms approximately 5000 unique tests of 100
items each were made up using a randomization program.
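
The abstract does not describe the randomization program. One plausible reading, sketched here as an assumption, is that each test form is an independent random draw of 100 distinct terms from the 1165-term frame. The function name, the `seed` parameter, and the placeholder term names are hypothetical.

```python
import random


def make_test_forms(terms, n_forms, items_per_form=100, seed=0):
    """Draw n_forms test forms, each a random sample of items_per_form
    distinct terms from the sampling frame."""
    rng = random.Random(seed)
    return [rng.sample(terms, items_per_form) for _ in range(n_forms)]


# Placeholder frame of 1165 term labels; only five forms are drawn here.
frame = ["term%d" % i for i in range(1165)]
forms = make_test_forms(frame, n_forms=5)
```

Independent draws of this kind do not guarantee that every form is unique or that every term receives the same number of responses, so a production program would need extra checks.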

The 154 symbols were obtained from the same textbooks 
and 9 tests of 36 symbols each were generated. Symbols 
were randomly selected and each symbol appeared in at 
least two of the tests. 

Seventh and eighth grade students from the United 
States were selected by what the authors call a propor- 
tionate stratified random sampling. From the sample of 
students approximately 350 responses were obtained for 
each term and 250 responses for each symbol. Measures 
of stability, level of agreement of scores, and other 
checks into the precision of the results were obtained. 



4. Findings

Frequency distributions of mathematical terms and 
mathematical symbols according to intervals of familiari- 
ty were presented. Lists of mathematical terms whose
degree of familiarity was found to be between 90% and
100% and between 80% and 90% were given. A list of sym-
bols known by at least 70% of the students was also given. 






5. Interpretations 

The following observations were made. 

1. Students tended to distinguish between word
forms with some precision, e.g.

    Word Form       Familiarity
    equal               92%
    equation            83%
    equality            72%
    equate              24%


2. Consistency of student responses and differentiation
according to the form in which the words
appear can be noted in the following example.

    Word Form       Familiarity
    commutative         71%
    associative         76%
    distributive        67%

    commutativity       44%
    associativity       39%
    distributivity      38%



Some rules for different word forms may have to be 
established. 

3. Students respond discriminatively to mathe- 
matical words used in different contexts. 

e.g. degree of an angle - 77% 

degree of a polynomial - 26% 

4. Familiarity measures for mathematics vocabulary
(terms and symbols) now exist.

Abstracter's Notes



The authors leave us with the question, "Will a meas- 
ure of vocabulary difficulty have predictive power in a 
readability formula?" Most readability formulas do con- 
tain vocabulary difficulty as a predictor variable and 
the care with which this research was carried out would 
indicate that the results will prove very useful in this 
connection. However, as the authors point out, the va- 
lidity of the measures they have developed is yet to be 
established. In any event it is almost certain that 
these measures will provide a useful guide for those in- 
volved in producing mathematics material for junior high 
school pupils at least. 



L. D. Nelson
University of Alberta

A TECHNIQUE FOR STUDYING CONCEPT FORMATION IN MATHEMATICS
Collis, K. F., Journal for Research in Mathematics Education,

1. Purpose

Is there a potential usefulness for a particular card 
sorting task in the study of mathematical concept formation? 

2. Rationale



Card sorting tasks have long been used by psychologists to 
study "artificial" concept formation. It is proposed here that 
such tasks can be designed to study the formation of mathematical 
concepts taught in school mathematics. It is implied that the 
card sorting task suggested helps to meet "A fundamental 
methodological requirement. . .that the researcher be able to 
observe, record, and quantify the child's mathematical thinking 
without too much disturbing, disrupting, or distorting it." 

3. Research Design and Procedure

Based on an examination of Hubbard, Numbers in Relationship 
(Academy Press, 1964) and discussions with grade - teachers 
using the text, 56 items similar to the following were chosen
and printed on 3x4 cards: 



     1.  3 x 4 = 4 x 3
     7.  w - m = m - w
    19.  y = ax + b
    20.  b = y - ax
    23.  5 - 5
    25.  0 x 7
    28.  2x = 8
    31.  3x - 7 = 12 - 7
    40.  2b = a
    42.  b/a = 1/2
    48.  x is integral, 3x = 7

    (A further item beginning "0 =" is illegible in the source.)


According to the experimenter and a panel of four expert
teachers, the items fell into six categories: (1) commutative
principle with contrasting examples, (2) equivalent formula,
(3) zero (with one contrast), (4) equation with one variable
x = 4, (5) ratio, and (6) impossible statements for these
students.

Two items were rejected as redundant. For the card sorting 
task each subject was given a pack of the 54 cards, instructed 
to lay them all face up on the table, and arrange the cards 
in any groups which seemed to go together. Cards which did 
not fit were kept separate. There was no time limit. The 
subject was not shown any category system and was free to make 
as many groups as she wanted. The experimenter recorded the 
items placed in each group and the number of groups for each 
subject. For a pilot study, five subjects were randomly selected 
from one grade 8 class in each of four convent girls' schools.
Each of the 20 subjects was given the card sorting task six
times over a period of seven months in 1965. According to
the experimenter, the student records were examined in two
ways:

(a) the development of what may be termed "pure" 
categories, that is, categories containing two or 
more cards but with no misconceptions or irrelevant 
cards included and (b) the patterns of category 
development that showed up upon inspection of the 
individual protocols. 

The experimenter was interested in the number of "pure" 
categories at each administration (graph suggests that the 
number of pure categories at each administration ranged from 
about 60 to 200) and the pattern of category development. 

The data were summarized descriptively. 
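
Counting "pure" categories in one subject's sort can be sketched as below. The abstract defines a pure category as one with two or more cards and no misconceptions or irrelevant cards; treating a group as pure when all of its cards share one of the panel's reference categories is an assumption made here, as are the function name and the example card-to-category mapping, which is inferred from the sample items rather than stated in the report.

```python
def count_pure_groups(groups, category_of):
    """groups: list of lists of card numbers formed by one subject.
    category_of: {card number: reference category label}.
    A group counts as pure if it holds at least two cards and all of
    them carry the same reference category."""
    pure = 0
    for group in groups:
        labels = {category_of[card] for card in group}
        if len(group) >= 2 and len(labels) == 1:
            pure += 1
    return pure


# Illustrative mapping for six of the sample cards (an inference, not data).
category_of = {
    1: "commutative", 7: "commutative",
    19: "equivalent formula", 20: "equivalent formula",
    28: "x = 4", 31: "x = 4",
}
# One subject's hypothetical sort: one pure group, one mixed group, one single card.
n_pure = count_pure_groups([[1, 7], [19, 20, 28], [31]], category_of)
```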

4. Findings 

The number of "pure" categories formed "increased" from first 
administration to last administration. Smaller categories 
were integrated in later administrations "in order to associate 
the cards concerned with a higher level principle." Based on 
the first three administrations the experimenter was "able to" 
predict the emergence of new categories as distinct groups in 
later administrations. No statistical tests of significance 
were used. 

5. Interpretation

The investigator concludes that 

a) designing, administering, and interpreting the 
results of the card sorting task are "skills 
already within the competence of the classroom 
mathematics teacher."

b) "the card-sorting technique would be of assistance
to a teacher in tracing a child's conceptualization 
of the various principles in a particular mathe- 
matics course." 

c) the technique would offer teachers and psychologists 
in remedial education "an aid to determining the 
adequacy of child's cognitive functioning level... 
without intervention of... reading and written expres- 
sion..." variables. 

d) "the possibilities of the technique for use in 
educational research have been enhanced by develop- 
ments in the field of factor analysis that enable 
data gathered by the means described above to be 
analyzed more objectively." 



Abstracter's Notes 

It is important to emphasize that this is a feasibility 
study for a particular research technique. Some questions 
come to mind: 

1. Can this card-sorting task be analyzed in the 
experimental psychologist's terms? Is it in the 
reception or selection paradigm? Are the concepts 
conjunctive, disjunctive, conditional, bicondi- 
tional, or some more complicated combination? Is 
the task simply concept formation when it involves 
multiple concepts? What about relevant and ir- 
relevant attributes? 

2. Did the experimenter entertain the hypothesis that 
the changes he observed could be caused by the 
training provided by the task itself? What is the 
reliability of the instrument? 

3. Why were there so many (200) distinct pure categories 
at the sixth administration? What were they? 

4. What types of statistical analysis could be applied 
to the data available from the card-sorting tasks? 
Could they have been illustrated with the pilot 
data? The experimenter's advice would be welcome. 

5. What limitations does the experimenter see for the 
card-sorting task? 

Since this is a feasibility study we assume the findings 
reported in the pilot study are simply illustrative of some 
of the potential questions which could be asked. The value 
of the report is in the description of the card-sorting task 
and the suggestion that it may be useful in educational 
research. 



Richard J. Shumway 

The Ohio State University

1. Purpose

To devise a set of prototypic tasks which would test 
various aspects of concept learning. To determine the 
effect of two instructional variables, number of instances 
and emphasis of relevant attribute values, on the per- 
formance of these tasks. 

Two hypotheses were tested: 1) that level of concept
mastery would increase as a function of the increase in
number of instances presented and 2) that emphasis of
relevant attribute values would facilitate concept learning.



2. Rationale



Concept learning research should be extended in several
ways: 1) a wider range of concepts should be examined,
with careful specification of the essential characteristics
of the concepts; 2) various instructional procedures
should be utilized, verbal as well as nonverbal
strategies; 3) a set of differentiated response measures
should be employed to assess both short-term and long-term
retention; and 4) performance of subjects of different
ages on the same task should be compared.

The study attempts to deal with these four needs within
the following framework. Fourth- and sixth-grade children
were taught geometric concepts which bore complex interrelationships
to one another. A strategy for characterizing
the concepts was developed. This consisted of determining
the attributes relevant and irrelevant to the
concept and of determining the relationships of each concept
to the other. Concepts were taught by a combination
of definitions and examples, with variation in number of
examples and relative emphasis of relevant attribute
values. Eleven tasks were identified as test items to
measure attainment of each concept.

The effect of number of instances presented on concept 
attainment has not been clearly established in previous 
research. 

Emphasis of relevant attribute values has been shown 
to improve concept learning but research has been on in- 
ductive tasks only. This study investigates the effect 
for deductive tasks. 



3. Research Design and Procedure 

The subjects were 154 fourth-grade and 126 sixth-grade 
children. The fourth-grade children comprised the entire 
fourth-grade population of one school in a midwestern 
suburban community. All sixth-grade children were from a 
middle school in the same community. These classes were 
selected from the total sixth-grade classes for convenience 
of scheduling. 

Instructional materials and tests were constructed and 
refined in a pilot study. They were developed from a set 
of behavioral objectives based on an analysis of cognitive 
processes in concept learning. Concepts used were the
geometric concepts quadrilateral, trapezoid, parallelogram,
rectangle, rhombus, square, and kite.

Lessons designed to teach the concepts were similar to 
the usual school lessons but controlled the particular 
variables of interest, number of examples and emphasis of 
relevant attribute values. Combinations of these variables 
and counterbalancing resulted in eight different treatments. 
Within each of the classes subjects were randomly assigned 
to one of the eight treatment groups. Each group had four 
lessons, one on background, one or two on attributes, one 
or two on concepts. 

Each concept lesson had two positive and two negative
instances. Since a group had either one or two concept
lessons, half the groups got 4 instances and half got 8
instances. For half the groups the concept lessons also
had questions directing attention to the relevant attribute
values and a review of these relevant values.
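
The treatment structure described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the investigators' procedure: the report does not say what was counterbalanced, so the two orders "A" and "B" are placeholders inferred from the count of eight treatments, and the names TREATMENTS, instances_presented, and assign_within_class are hypothetical.

```python
import itertools
import random

POSITIVE_PER_LESSON = 2  # positive instances in each concept lesson
NEGATIVE_PER_LESSON = 2  # negative instances in each concept lesson


def instances_presented(concept_lessons):
    """Total instances seen by a group with the given number of concept lessons."""
    return concept_lessons * (POSITIVE_PER_LESSON + NEGATIVE_PER_LESSON)


# 2 (one or two concept lessons) x 2 (emphasis absent or present)
# x 2 (counterbalancing order; ASSUMED, the report gives no detail) = 8
TREATMENTS = [
    {"instances": instances_presented(lessons), "emphasis": emphasis, "order": order}
    for lessons, emphasis, order in itertools.product((1, 2), (False, True), ("A", "B"))
]


def assign_within_class(pupils, rng):
    """Randomly assign the pupils of one class to the eight treatments.

    Shuffling and then dealing pupils out in turn keeps the group
    sizes within a class as equal as possible.
    """
    shuffled = list(pupils)
    rng.shuffle(shuffled)
    return {pupil: TREATMENTS[i % len(TREATMENTS)] for i, pupil in enumerate(shuffled)}
```

Calling assign_within_class once per class, with a fresh list of pupil identifiers each time, mirrors the statement that assignment was random within each class rather than across the whole sample.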

Tests were a multiple-choice test and a completion test 
developed from the pilot study tests. 

Experimenters were two graduate students, familiar with 
materials and procedure. 

The design was a treatments X blocks design with subjects
nested within class and treatments crossed with class.
A two-way fixed effects analysis of variance model was
assumed with the mean square error term as the denominator
of the F-ratio for both main effects and interaction.
Independent variables were number of concept examples (4
or 8) and emphasis on relevant attribute values (presence
or absence of emphasis).
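
In a two-way fixed effects model of this kind, every test shares the same denominator. Written out in standard notation (the symbols are not taken from the original report: A and B stand for the two factors of the model and MS for a mean square):

```latex
F_{A} = \frac{MS_{A}}{MS_{\text{error}}}, \qquad
F_{B} = \frac{MS_{B}}{MS_{\text{error}}}, \qquad
F_{A \times B} = \frac{MS_{A \times B}}{MS_{\text{error}}}
```

Using the error mean square throughout is what distinguishes the fixed effects case from random or mixed models, where an interaction mean square can serve as the denominator for a main effect.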



4. Findings

Reliability estimates (Hoyt) for the total multiple-choice
test were .81 for grade 4 and .86 for grade 6; for the
total completion test they were .87 for grade 4 and .87
for grade 6.

Multivariate analyses of covariance were carried out
for each grade level. Dependent variables were total score
on the multiple-choice test (MT) and total score on the
completion test (CT). The covariate was the raw score on
the Paragraph Meaning test of the SAT, in order to reduce
variability due to differences in reading ability. The
analysis revealed that the covariate had a highly
significant correlation with the dependent variables.

There was a significant variation among mean vectors
over the six class groups. A t test indicated that the
differences were not due to the different experimenters
but to differences among class groups.

The variation in mean vectors due to emphasis of relevant
attribute values was highly significant for Grade 4
(p < 0.0085 on MT, p < 0.0076 on CT). The effect of number
of instances and the interaction between number of
instances and emphasis of relevant attribute values were
not significant.

For Grade 6, there was no significant variation among 
mean vectors for any of the main effects or interactions. 



5. Interpretations

While item difficulties increased from task level to 
task level on the test, there is not sufficient evidence 
to suggest a hierarchy of task complexity. Refinements 
in tests and expansion of subtests might permit more
analytical differentiation of levels of concept mastery.

Interference in tasks may have been due to different 
meanings previously associated with concept labels and to 
similarity of concept labels themselves. 

The general lack of effect of the variable of number
of concept instances may be related to the fact that all
groups received both positive and negative instances.
Greater effects might have resulted from use of positive
instances only.

The significant effect of emphasis of relevant attribute
values at Grade 4 and lack of effect at Grade 6 suggest
that the greatest effect of such emphasis is on the
ability to correctly label attribute values.



Abstracter's Notes 



This is a very well-designed and well-conceived study,
focusing upon important variables in concept learning in
mathematics. As is so often the case, however, the results
leave us with continuing uncertainties about the
effects of the variables. The limits of time and other
constraints imposed on research in the classroom make
definitive answers extremely elusive.

A factor which must have some effect on the results is
the students' background and exposure to the concepts
taught. Even though they were not a part of the formal
program previously, many of the concepts (squares,
rectangles, etc.) were those to which nearly everyone is
exposed informally in varying degree prior to Grade 4. This
cannot be fully accounted for or controlled in experiments
of this type.

The question of an effect of number of instances is
complicated or clouded by the equating of positive and
negative instances. Previous experimentation in concept
learning suggests we might look more closely at the
particular sequence and combination (perhaps ratio) of
positive and negative instances before we can determine
clearly the comparative effects of number of instances.



Shirley A. Hill 
University of Missouri 
Kansas City 
1. Purpose



The purpose of this study was to investigate the 
effectiveness of spaced reviews in terms of retention of 
mathematical rule learning. Specifically, the investigator 
sought to determine 

a) the effects of one review on rule retention 

b) the effects of temporal position of one review 
on rule retention 

c) the effects of two reviews on rule retention 
(regardless of temporal position) 

d) the effects of temporal position of two reviews 
on rule retention. 

Three hypotheses were developed from a review of previous
research. In summary, they are:

a) One review will significantly enhance retention 
of rule learning 

b) Temporal position of one review will not have a 
significant effect on retention of rule learning 

c) One early and one late review will be more
effective than either two early reviews or two late
reviews in strengthening retention of rule
learning.

A number of subsidiary questions were also investigated. 

2. Rationale



An extensive literature search was made of studies of
the relationship of review to retention of both
non-meaningful and meaningful learning. The latter studies
usually employed reading passages. These studies of
retention of meaningful learning tended to support the
following generalizations:



a) Both early and late reviews seem to affect
retention equally, but for different reasons. An early
review appears to promote consolidation of what
has been learned, while a late review appears to
promote relearning of what has been forgotten.
Therefore, one would not expect differences in
retention scores when comparing a group receiving
an early review with a group receiving a late
review.

b) Retention does not vary when the degree of original
learning is the same for all subjects.

c) Spaced reviews or distributed practices are more 
effective than massed practice. 

d) One review will produce greater retention than no 
review and two reviews will produce nearly three 
times as much retention as one review. 

However, none of the earlier researchers had investigated
the effects of review on retention of intellectual
skill or concept learning such as mathematical rule
learning. Nor had any previous investigation been made
of the effects of temporal position of reviews on such
learning.

The author derived the purposes and hypotheses from
implications of these previous studies.

3. Research Design and Procedure 

This study consisted of two separate experiments as 
described below. 

Experiment I was designed to examine the temporal 
position effects of one review on retention of mathematical 
rule learning. The sample was composed of 53 grade eight 
subjects randomly assigned to four groups. 

Group 1 received one review one day after 
original learning 

Group 2 received one review one week after 
original learning 

Group 3 received one review two weeks after
original learning

Group 4 received no reviews

Experiment II was designed to examine the temporal 
position effects of two reviews on retention of mathematical 
rule learning. The sample consisted of 67 grade seven 
subjects randomly assigned to four groups. 

Group 1 received a review one day and two days after
original learning

Group 2 received a review one day and seven days 
after original learning 

Group 3 received a review six and seven days after 
original learning 

Group 4 received no reviews. 

All subjects in both experiments were taught four
mathematical rules by a C.A.I. program. Two were algebraic
rules: raising an algebraic expression to an indicated
power; and determining the exponent of the product of
indicated factors. Two were geometric rules: finding the
measure of a third angle of a triangle when two are given;
and finding a geometric mean.

All subjects in both experiments were given the rules,
shown examples, then asked to practice until they attained
a success criterion of two successive correct solutions for
each rule.
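
The success criterion can be stated as a short routine. The sketch below is illustrative only and is not drawn from the C.A.I. program itself; the function name examples_to_criterion and the representation of a practice session as a list of True/False outcomes are assumptions.

```python
def examples_to_criterion(responses, run_length=2):
    """Count the practice examples attempted up to the point of mastery.

    `responses` is the learner's sequence of outcomes for one rule
    (True = correct solution). Mastery is reached at the first run of
    `run_length` consecutive correct solutions. Returns None if the
    criterion is never met within the recorded responses.
    """
    streak = 0
    for attempted, correct in enumerate(responses, start=1):
        # An error resets the run; a correct solution extends it.
        streak = streak + 1 if correct else 0
        if streak == run_length:
            return attempted
    return None


# A learner who errs on the second example needs four examples in all.
print(examples_to_criterion([True, False, True, True]))  # prints 4
```

The count returned here corresponds to the "number of examples" required to reach mastery, one of the two quantities the investigator recorded at each session.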

Each review group in both experiments practiced until 
the same criterion level of success was attained. A delayed 
retention test was given to all subjects 21 days after 
initial learning. 
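
The review schedules of the two experiments, together with the 21-day retention test, can be laid out as data. The sketch below is an illustration, not part of the study: it assumes original learning falls on day 0 and reads "one week" as day 7 and "two weeks" as day 14; the names REVIEW_DAYS and days_since_last_exposure are hypothetical.

```python
RETENTION_TEST_DAY = 21  # days after original learning (day 0, assumed)

# Days on which each group reviewed, keyed by experiment and group number.
REVIEW_DAYS = {
    "Experiment I": {1: [1], 2: [7], 3: [14], 4: []},
    "Experiment II": {1: [1, 2], 2: [1, 7], 3: [6, 7], 4: []},
}


def days_since_last_exposure(review_days):
    """Days between a group's last contact with the rules and the retention test.

    For a no-review group the last contact is original learning itself (day 0).
    """
    return RETENTION_TEST_DAY - max([0] + list(review_days))


for experiment, groups in REVIEW_DAYS.items():
    for group, days in groups.items():
        print(experiment, "Group", group, days_since_last_exposure(days))
```

Laying the schedules out this way makes visible that the groups differ in how long before the retention test their last review fell, as well as in the position of their reviews.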

The following experimenter-constructed tests, each
consisting of eight items (two for each rule), were used:

a) Pre-learning and post-learning test

b) Pre-review one and post-review one test

c) Pre-review two and post-review two test

d) Delayed retention test 

A record was made of the number of examples and the 
time required to reach mastery at each session. 

Data were analyzed by analysis of variance and
covariance techniques.

4. Findings 

(a) For Experiment I 

(i) All review groups scored significantly higher 
than the no review group on the delayed 
retention test (p < .05) 

(ii) The effect of temporal position of one review 
was not significant 

(b) For Experiment II 

(i) All review groups scored significantly higher
than the no review group on the delayed
retention test (p < .01)








